Text File | 1994-12-21 | 33KB | 689 lines

CHAPTER 10   RELOCATION AND LINKAGE

A86 allows you to produce either .COM files, which can be run immediately as standalone programs, or .OBJ files, to be fed to the MS-DOS LINK program. In this chapter I'll discuss the .OBJ mode of A86.

.OBJ Production Made Easy

I'll start by giving you the minimum amount of information you need to know to produce .OBJ files. If you are writing short interface routines, and do not want to concern yourself with the esoterica of .OBJ files (segments, groups, publics, etc.), you can survive quite nicely by reading only this section.

There are two ways you can cause A86 to produce a .OBJ file as its object output. One way is to explicitly give .OBJ as the output file name: for example, you can assemble the source file FOO.8 by giving the command "A86 FOO.8 FOO.OBJ". The other way is to specify the switch +O (letter O, not digit 0). This is illustrated by the invocation "A86 +O FOO.8", which will have the same effect as the first invocation.

My design philosophy for .OBJ production is to accommodate two types of user. The first type of user is writing new code, to link to other (usually high level language) modules. That person should be able to write the module with a minimum of red tape, and have A86 do the right thing.

The second type of user has existing modules written for Intel/IBM assemblers, and wants to port them to A86. A86 should recognize and act upon all the relocation directives (SEGMENT, GROUP, PUBLIC, EXTRN, NAME, END) given. The assembly should work even if several files, assembled separately under the Intel/IBM assembler, are fed to a single A86 assembly. You'll see if you read on through this entire chapter that the multiple-files requirement causes A86 to interpret some of the relocation directives a little differently (while achieving compatible results).

Let's suppose you're writing new code: for example, an interface routine to the "C" language, that multiplies a 16-bit number by 10. "C" pushes the input number onto the stack, before calling your routine. Your code needs to get the number, multiply it by 10, and return the answer in the AX register. You can code it:

_MUL10:            ; "C" expects all public names to start with "_"
  PUSH BP          ; "C" expects BP to be preserved
  MOV BP,SP        ; we use BP to address the stack
  MOV AX,[BP+4]    ; fetch the number N, beyond BP and the ret addr
  ADD AX,AX        ; 2N
  MOV BX,AX        ; 2N is saved in BX
  ADD AX,AX        ; 4N
  ADD AX,AX        ; 8N
  ADD AX,BX        ; 8N + 2N = 10N
  POP BP           ; BP is restored
  RET              ; go back to caller

These 11 lines can be your entire source file! If you name the file MUL10.8 and assemble it in .OBJ mode (for example, with "A86 +O MUL10.8"), A86 will create an object file MUL10.OBJ, that conforms to the standard SMALL model of computation for high level languages. If you use RETF instead of RET (thus, by the way, getting the operand from BP+6 instead of BP+4), the object module will conform to the standard LARGE model of computation. All the red tape information required by the high level language is provided implicitly by A86. I'll go through this information in detail later, but you should need to read about it only if you're curious.

What happens if you need to access symbols outside the module you're assembling? If the type of the symbol is correctly guessed from the instruction that refers to it, then you can simply refer to it, and leave it undefined within the module. For example, if A86 sees the instruction CALL PRINT with PRINT undefined, it will assume that PRINT is a NEAR procedure. If PRINT is never defined within the module, A86 will act as if you declared PRINT via the directive EXTRN PRINT:NEAR. The address of PRINT will be plugged into your instruction by LINK when it combines A86's .OBJ file with the high level language's .OBJ files, to make the final program.

In general, the undefined operand to any CALL or JMP instruction is assumed to be NEAR. The second (source) operand to a MOV or arithmetic instruction is assumed to be ABS (i.e., an immediate constant).
An undefined first (destination) operand is assumed to be a simple memory variable, of the same size (BYTE or WORD) as the register given in the second operand. If your external symbol does not comply with these guidelines, you need to declare it with an EXTRN before you use it. (You can also use EXTRN to declare types of non-complying forward references within your module, as you'll see later.)

If you'd like to link the MUL10 procedure to Turbo Pascal V4.0 or later, you need to add the line

  CODE SEGMENT PUBLIC

at the top of the program, to name the program segment according to Turbo Pascal's expectations. You may dispense with the leading underscore in the name _MUL10 -- Turbo Pascal does not require or expect it.

At this point, if you're a casual user, I think you've read enough to get going! Read further only if you wish; or if you get stuck, and need to master the esoterica.

Overview of Relocation and Linkage

When you assemble a program directly into a .COM file, the program has just two forms: the source program, that you can understand, and the .COM file, that the computer can "understand" (i.e., execute). A .OBJ file is an intermediate format: neither you nor the (executing) computer can make sense out of a .OBJ file; only programs like LINK interpret .OBJ files. The purpose of a .OBJ file is to allow you to assemble or compile just a part of a program. The other parts (also in the form of .OBJ files) can be produced at a different time; often by a different assembler or compiler, whose source files are in a different language.

It's easy to see where the word "linkage" comes from: the LINK program puts the pieces of a program together. The "relocation" comes because the assembler or compiler that makes a given program piece doesn't know how many other pieces will come before it, or how big the other pieces will be. Each piece is constructed as if it started at location 0 within the program; then LINK "relocates" the piece to its true location.
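
The idea of building each piece at location 0 and relocating it afterwards can be sketched in a few lines of C. This is only a toy model under stated assumptions: the Piece structure and the names place_piece and demo_link are hypothetical, pieces are simply packed end to end with no segments or alignment, and nothing here reflects the real .OBJ record format or the algorithm LINK actually uses.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model of one program piece (NOT the real .OBJ
   format): its bytes, assembled as if the piece began at location 0, plus
   the offsets of 16-bit address fields that point inside the same piece. */
typedef struct {
    const unsigned char *bytes;
    size_t               size;
    const size_t        *fixups;    /* offsets of little-endian 16-bit addresses */
    size_t               n_fixups;
} Piece;

/* Copy one piece into the image at `base`, then add `base` to each listed
   address field, so addresses computed from 0 point to the true location.
   Returns the first free location after the piece. */
static size_t place_piece(unsigned char *image, size_t base, const Piece *p)
{
    memcpy(image + base, p->bytes, p->size);
    for (size_t i = 0; i < p->n_fixups; i++) {
        unsigned char *field = image + base + p->fixups[i];
        unsigned addr = field[0] | ((unsigned)field[1] << 8);
        addr = (addr + (unsigned)base) & 0xFFFFu;   /* 16-bit wraparound */
        field[0] = (unsigned char)(addr & 0xFFu);
        field[1] = (unsigned char)(addr >> 8);
    }
    return base + p->size;
}

/* Place two sample pieces one after the other; `image` needs at least
   10 bytes. Returns the total size of the combined image. */
size_t demo_link(unsigned char *image)
{
    /* First piece: NOP, NOP, NOP, RET. It contains no addresses. */
    static const unsigned char first_bytes[] = { 0x90, 0x90, 0x90, 0xC3 };

    /* Second piece: MOV AX,[0004] / RET / DW 1234h. The word variable sits
       at offset 4 of this piece, so the address field at offset 1 holds 0004. */
    static const unsigned char second_bytes[] =
        { 0xA1, 0x04, 0x00, 0xC3, 0x34, 0x12 };
    static const size_t second_fixups[] = { 1 };

    const Piece first  = { first_bytes,  sizeof first_bytes,  NULL, 0 };
    const Piece second = { second_bytes, sizeof second_bytes, second_fixups, 1 };

    size_t next = place_piece(image, 0, &first);
    return place_piece(image, next, &second);
}
```

Calling demo_link on a 16-byte buffer places the second piece at location 4, because the first piece occupies locations 0 through 3. The address field inside the MOV instruction changes from 0004 to 0008, which is where the word 1234h now lives in the combined image.
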
Many of the relocation features of 86 assembly language are couched in terms of LINK's point of view, so we must look at the way LINK sees things. LINK calls a .OBJ file an "object module", or just "module". Each module has a NAME, that can be referred to when LINK issues diagnostic messages, such as error messages and symbol maps.

If a program symbol is used only within a single module, it does not need to be given to LINK, except possibly to pass along to a symbolic debugger. On the other hand, if a program symbol is defined in one module and referenced in other modules, then LINK needs to know the name of the symbol, so it can resolve the references. Such a symbol is PUBLIC in the module in which it is defined; it is "external" in the other modules, containing references to it.

Finally, exactly one module in a program must contain the starting location for the program; that module is called the "main module", and it must supply the starting address (which is not necessarily at the beginning of the module).

In the 86 family of microprocessors, the LINK system also does much to manage the memory segments that a program will fit into, and get its data from. The (grotesquely ornate) level of support for segmentation was dictated by Intel, when it specified (and IBM and the compiler makers accepted) the format that .OBJ files will have. I attended the fateful meeting at Intel, in which the crucial design decisions were made. I regret to say that I sat quietly, while engineers more senior than I applied their fertile imaginations to construct fanciful scenarios which they felt had to be supported by LINK. Let's now review the resulting segmentation model.

The parts of a program, as viewed by LINK, come in three different sizes: they